$$\mathrm{Util}_{t+1}(s_t, a) = (1 - \alpha)\,\mathrm{Util}_t(s_t, a) + \alpha \left[ \mathrm{Reward}(s_t, a) + \gamma \max_b \mathrm{Util}_t(s_{t+1}, b) \right]$$

Figure 7: The RL rule for updating the estimated utility. Here $\alpha$ is a number less than one that determines the rate of change of the estimate. Note that the second part of the equation is similar to the equation in Figure 4, except that there are no expectation signs $E$ anywhere.
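The update rule in Figure 7 can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `update_utility`, the dictionary-based table, the state and action labels, and the values chosen for the rate (`alpha`) and discount (`gamma`) are all assumptions made for the example. It assumes the rate multiplies the whole sampled target (observed reward plus discounted best next utility).

```python
from collections import defaultdict


def update_utility(util, state, action, reward, next_state, actions,
                   alpha=0.1, gamma=0.9):
    """Apply one step of the Figure 7 update to the table `util` in place.

    `util` maps (state, action) pairs to estimated utilities. All names
    here are hypothetical and chosen only for this sketch.
    """
    # Best estimated utility available from the next state (the max over b).
    best_next = max(util[(next_state, b)] for b in actions)
    # Sampled target: the observed reward plus the discounted best next
    # utility. A single observed sample is used, so no expectation appears.
    target = reward + gamma * best_next
    # Blend the old estimate with the target at rate alpha.
    util[(state, action)] = (1 - alpha) * util[(state, action)] + alpha * target
    return util[(state, action)]


# Usage: unseen (state, action) pairs start at an estimated utility of 0.0.
util = defaultdict(float)
actions = ["left", "right"]

first = update_utility(util, "s0", "right", reward=1.0, next_state="s1",
                       actions=actions, alpha=0.5, gamma=0.9)

# Pretend the agent has since learned that "left" in s1 is worth 2.0.
util[("s1", "left")] = 2.0
second = update_utility(util, "s0", "right", reward=1.0, next_state="s1",
                        actions=actions, alpha=0.5, gamma=0.9)
print(first, second)
```

With a rate close to zero the estimate barely moves on each observation, while a rate close to one nearly replaces the old estimate with the latest sampled target.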